Abstract

In the wake of the decisive impossibility result of Fischer, Lynch, and Paterson for 
deterministic consensus protocols in the asynchronous model with just one failure, Ben-Or
and Bracha demonstrated that the problem could be solved with randomness, even for 
Byzantine failures. Both protocols are natural and intuitive to verify, and Bracha's achieves 
optimal resilience. However, the expected running time of these protocols is exponential in 
general. Recently, Kapron, Kempe, King, Saia, and Sanwalani presented the first efficient 
Byzantine agreement algorithm in the asynchronous, full information model, running in 
polylogarithmic time. Their algorithm is Monte Carlo and drastically departs from the 
simple structure of Ben-Or and Bracha's Las Vegas algorithms. 

In this paper, we begin an investigation of the question: to what extent is this departure 
necessary? Might there be a much simpler and intuitive Las Vegas protocol that runs in 
expected polynomial time? We will show that the exponential running time of Ben-Or and 
Bracha's algorithms is no mere accident of their specific details, but rather an unavoidable 
consequence of their general symmetry and round structure. We define a natural class 
of "fully symmetric round protocols" for solving Byzantine agreement in an asynchronous 
setting and show that any such protocol can be forced to run in expected exponential time by 
an adversary in the full information model. We assume the adversary controls t Byzantine 
processors for t = cn, where c is an arbitrary positive constant < 1/3. We view our result as
a step toward identifying the level of complexity required for a polynomial-time algorithm 
in this setting, and also as a guide in the search for new efficient algorithms. 

1 Introduction 

Byzantine agreement is a fundamental problem in distributed computing, first posed by Pease, 
Shostak, and Lamport [23]. It requires n processors to agree on a bit value despite the presence 
of failures. We assume that at the outset of the protocol, an adversary has corrupted some t 
of the n processors and may cause these processors to deviate arbitrarily from the prescribed 
protocol in a coordinated malicious effort to prevent agreement. Each processor is given a bit 
as input, and all good (i.e. uncorrupted) processors must reach agreement on a bit which is 
equal to at least one of their input bits. To fully define the problem, we must specify the model 
for communication between processors, the computational power of the adversary, and also the 
information available to the adversary as the protocol executes. We will work in the message 
passing model, where each pair of processors may communicate by sending messages along 
channels. It is assumed that the channels are reliable, but asynchronous. This means that a 
message which is sent is eventually received (unaltered), but arbitrarily long delays are allowed. 
We assume that the sender of a message is always known to the receiver, so the adversary cannot 
"impersonate" uncorrupted processors. 

We will be very conservative in placing limitations on the adversary. We consider the full
information model, which allows a computationally unbounded adversary who has access to 
the entire content of all messages as soon as they are sent. We allow the adversary to control 
message scheduling, meaning that message delays and the order in which messages are received 
may be maliciously chosen. One may consider a non-adaptive adversary, who must fix the t 
faulty processors at the beginning of the protocol, or an adaptive adversary, who may choose 
the t faulty processors as the protocol executes. Since we are proving an impossibility result, 
we consider non-adaptive adversaries (this makes our result stronger). We will consider values 
of t of the form t = cn for some positive constant c < 1/3. (The problem is impossible to solve if
t ≥ n/3.) We define the running time of an execution in this model to be the maximum length of
any chain of messages (ending once all good processors have decided). 
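
As a minimal illustration of this running time measure, the sketch below computes the longest chain in a finished execution. It assumes (this representation is ours, not the paper's) that an execution is summarized by a hypothetical mapping `depends_on` from each message to the messages its sender had received before sending it.

```python
from functools import lru_cache

def longest_message_chain(depends_on):
    """Length of the longest chain of messages in an execution.

    depends_on: dict mapping a message id to the list of message ids that
    its sender had received before sending it (a hypothetical encoding).
    """
    @lru_cache(maxsize=None)
    def chain_length(msg):
        # A message with no predecessors forms a chain of length 1.
        preds = depends_on.get(msg, ())
        return 1 + max((chain_length(p) for p in preds), default=0)

    return max((chain_length(m) for m in depends_on), default=0)
```

For example, if message d was sent after its sender received b and c, and both b and c were sent after a was received, the longest chain a, b, d has length 3.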

In the asynchronous setting, the seminal work of Fischer, Lynch, and Paterson [13] proved 
that no deterministic algorithm can solve Byzantine agreement, even for the seemingly benign 
failure of a single unannounced processor death. More specifically, they showed that any deter- 
ministic algorithm may fail to terminate. In light of this, it is natural to consider randomized 
algorithms with a relaxed termination requirement, such as terminating with probability one. 
In quick succession following the result of [13], Ben-Or [3] and Bracha [6] each provided
randomized algorithms for asynchronous Byzantine agreement terminating with probability one
and tolerating up to t < n/5 and t < n/3 faulty processors respectively. These algorithms feature a
relatively simple and intuitive structure, but suffer greatly from inefficiency, as both terminate
in expected exponential time. However, when the value of t is very small, namely $O(\sqrt{n})$, the
expected running time is constant.

This state of affairs persisted for a surprising number of years, until the recent work of
Kapron, Kempe, King, Saia, and Sanwalani [16] demonstrated that polynomial-time (in fact,
polylogarithmic-time) solutions are possible. They presented a polylogarithmic-time algorithm
tolerating up to $(1/3 - \epsilon)n$ faulty processors (for any positive constant $\epsilon$) which is Monte Carlo
and succeeds with probability $1 - o(1)$ [17]. The protocol is quite technically intricate and has a
complex structure. It subtly combines and adapts several core ingredients: Feige's lightest bin
protocol [12], Bracha's exponential time Byzantine agreement protocol (run by small subsets of
processors) [6], the layered network structure introduced in [20, 21], and averaging samplers.

This protocol is a great theoretical achievement, but its use of samplers in particular would 
pose a challenge to anyone attempting to implement and use the protocol. The authors note: 
"For the use of these samplers in our protocols, we assume either a nonuniform model in which 
each processor has a copy of the required samplers for a given input size, or else that each 
processor initializes by constructing the required samplers in exponential time. Alternatively, we 
could use versions of the efficient constructions given in [15] at the expense of a polylogarithmic 
overhead in the overall running time of the protocol" [17]. There is no proof given for this
alternative, and there is no further discussion of how this should be implemented. Also, having 
hard-coded copies of the samplers for a fixed size (or a small number of sizes) stored in the 
processors may significantly limit flexibility in practice, as one may want to routinely change 
the number of processors in the system. 

Additionally, it seems quite hard to adapt the techniques of Kapron et al. to obtain a Las 
Vegas algorithm and/or an algorithm against an adaptive adversary, since their protocol relies 
heavily on universe reduction to ultimately reduce to a very small set of processors. Once we 
reduce to considering a small subset of the processors, an adaptive adversary could choose to 
corrupt the entire subset. Even against a non-adaptive adversary, there is always some chance 
that the small subset we ultimately choose will contain a high percentage of faulty processors. 
This is essentially why the Kapron et al. protocol incurs a (small) nonzero probability of 
failure. We note there are other techniques that may be useful in the Monte Carlo setting but 
also seem difficult to adapt to the Las Vegas setting. For example, eliminating processors who 
send messages which are unlikely to have been sent by good processors may be a successful
strategy for a Monte Carlo algorithm, but a Las Vegas algorithm cannot risk eliminating many 
processors for doing things that may be done by good processors with small probability, since 
it must avoid incurring a nonzero chance of eliminating too many good processors and failing. 

Compared to the protocols of Ben-Or [3] and Bracha [6], the Kapron et al. protocol [16, 17]
appears to be a distant point in what may be a large landscape of possible algorithms. The full 
range of behaviors and tradeoffs offered by this space remains to be explored. Many interesting 
questions persist: is there a Las Vegas algorithm that terminates in expected polynomial time? 
Is there an expected polynomial time algorithm against an adaptive adversary? Is there a much 
simpler algorithm that performs comparably to the Kapron et al. algorithm, or at least runs in 
polynomial time with high probability? 

In this work, we investigate why simple Las Vegas algorithms in the spirit of [3, 6] cannot
deliver expected polynomial running time for linear values of t (i.e. t = cn for some positive 
constant c). More precisely, we define a natural class of protocols which we call fully sym- 
metric round protocols. This class encompasses Ben-Or [3] and Bracha's protocols [6], but is 
considerably more general. Roughly speaking, a protocol belongs to this class if all processors 
follow the same program proceeding in broadcast rounds where the behavior is invariant un- 
der permutations of the identities of the processors attached to the validated messages in each 
round. In other words, a processor computes its message to broadcast in the next round as a 
randomized function of the set of messages it has validated, without regard to their senders. 
We additionally constrain the protocols in the following way. Whenever a processor chooses its 
message randomly, it must choose from a constant number of possibilities. This means that at 
each step of the protocol, a processor will make a random choice between at most R alterna- 
tives, where R is a fixed constant. Note that the set of alternatives itself can vary; it is only 
the maximum number of choices that is fixed. We give a formal description of fully symmetric 
round protocols in Section 3. We will prove that for any algorithm in this class which solves
asynchronous Byzantine agreement, there exist input values and an adversarial strategy
which cause the expected running time to be exponential in n, when t = cn for any fixed
positive constant c.

Our general proof strategy is to consider a chain of E-round executions (for some suitably
large value E) where the behavior of some good processors is the same between any two adjacent 
executions in the chain, and the two ends of the chain must have different decision values. This 
implies that some execution in the chain must not have terminated within E rounds. This 
is reminiscent of a strategy often used to prove a lower bound of t rounds for deterministic 
protocols in the synchronous setting (see [11] for example). Employing this sort of strategy
for randomized algorithms presents an additional challenge, since any particular execution may 
be very unlikely. To address this, we consider classes of closely related executions where an 
adversary is able to exert enough control over a real execution to force it to stay within a 
chosen class with significant probability. 

We view this work not as a primarily negative result, but rather as a guide in the search for 
new efficient Byzantine agreement algorithms in the asynchronous, full information setting. The 
goal of this paper is to illuminate some of the obstacles that must be surmounted in order to 
find an efficient Las Vegas protocol and to spur new thinking about protocols which lie outside 
the confines of our impossibility result without requiring the full complexity of the Kapron et 
al. protocol. We hope that the final outcome of this line of research will be interesting new 
algorithms as well as a greater understanding of the possible features and tradeoffs for protocols 
in this environment. 

1.1 Other Related Work



Asynchronous Byzantine agreement has also been studied in the setting where cryptographic 
primitives are available (for this, the adversary must be assumed to be computationally bounded). 
Both Rabin [23] and Toueg [25] presented solutions in this model, supposing that messages are 
authenticated by digital signatures and processors share a secret sequence of random bits sup- 
plied in advance by a trusted dealer. Both solutions terminate in a small constant number of 
expected rounds. Assuming private channels between pairs of processors, Berman and Garay [5]
and Canetti and Rabin [9] provided additional solutions. Work in the cryptographic setting has
ultimately led to protocols that terminate in constant expected time, have optimal resilience
(t < n/3), and send $O(n^2)$ messages (protocols provided by Cachin, Kursawe, and Shoup [8] and
Nielsen [22]).

In the synchronous, full-information setting, polylogarithmic round randomized protocols
for Byzantine agreement against a non-adaptive adversary were given by King, Saia, Sanwalani,
and Vee [20, 21], Ben-Or, Pavlov, and Vaikuntanathan [4], and Goldwasser, Pavlov, and
Vaikuntanathan [14]. Restricting the adversary to be non-adaptive is necessary to achieve
polylogarithmic time protocols (for values of t which are linear in n), since Bar-Joseph and Ben-Or [2] have
proven that any randomized, synchronous protocol against a fail-stop, full information adversary
who can adaptively fail t processors must require $\Omega(t/\sqrt{n \log n})$ rounds in expectation.

Another lower bound for randomized Byzantine agreement protocols was proven by Attiya
and Censor [1], who showed that for each integer k, the probability that a randomized Byzantine
agreement algorithm tolerating t faults with n processors does not terminate in k(n - t) steps
is at least $1/c^k$ for some constant c. This bound holds even against a considerably weaker
adversary than we are considering.

Recent work of King and Saia [18, 19] has provided Byzantine agreement protocols in the
synchronous setting with reduced communication overhead, namely $\tilde{O}(n^{3/2})$ bits in the full
information model against a non-adaptive adversary [18], and $\tilde{O}(\sqrt{n})$ bits against an adaptive
adversary under the assumption of private channels between all pairs of processors [19].

The use of averaging samplers in recent protocols is foreshadowed by a synchronous protocol 
presented by Bracha [7] that assigned processors to committees in a non-constructive way. Chor 
and Dwork [10] provide an excellent survey that covers this as well as the other early work we 
have referenced. 



2 Preliminaries 

We begin by formally specifying the model and developing a needed mathematical definition. 

2.1 The Asynchronous, Full Information Message Passing Model and Ran- 
domized Algorithms 

We consider n processors who communicate asynchronously by sending and receiving messages. 
We assume that the communication channel between two processors never alters any messages, 
and that the sender of a message can always be correctly determined by the receiver. To model 
asynchrony, we follow the terminology of [13]. We suppose there is a message buffer, which
contains all messages which have been sent but not yet received. A configuration includes 
the internal states of all processors as well as the contents of the message buffer. A protocol 
executes in steps, where a single step between two configurations consists of a single processor p 
receiving a value m from the message buffer, performing local computation (which may involve 
randomness), and sending a finite set of messages to other processors (these are placed in the 
message buffer). We note that the value returned by the message buffer is either a message 
previously sent to p or $\emptyset$ (which means that no message is received). The only constraint on the
non-deterministic behavior of the message buffer is that if a single processor p takes infinitely 
many steps, then every message sent to p in the message buffer is eventually received by p. 

We suppose there is an adversary who controls some t of the processors. We assume these t 
processors are fixed from the beginning of the protocol. These will be called the faulty proces- 
sors, while the other processors will be called good processors. The faulty processors may behave 
arbitrarily and deviate from the protocol in malicious ways. The adversary also controls the 
message scheduling (i.e. it decides which processor takes the next step and what the message 
buffer returns, subject to the constraints mentioned above). Our adversary is computationally 
unbounded, and has access to the content of all messages as soon as they are sent. Based on 
this information, the adversary can adaptively decide in what order to deliver messages, subject 
only to the constraint that all messages which are sent between good processors must eventually 
be delivered. 

We model the use of randomness in a protocol by allowing each processor to sample from 
its own source of randomness, which is independent of the sources sampled by other processors 
and unpredictable to the adversary. This means that before a good processor samples from 
its random source, the adversary will know only the distribution of the possible outcomes and 
nothing more. 

We note that the outcome of a step of the protocol taken by a processor p is determined
by the configuration before the step, the message (or $\emptyset$) received by p at the beginning of the
step, and the local randomness of p. For steps where no randomness is used, the outcome is
determined by the prior configuration and the received message only. We let $r_p$ denote the local
randomness of processor p sampled during a step, and we refer to $e := (p, m, r_p)$ as an event.
If C denotes the current configuration, then e(C) denotes the new configuration resulting from
this event. If the message m is either $\emptyset$ or is in the message buffer (and intended for p) in the
configuration C, then we say that e can be applied to C. We define a schedule from C to be a
sequence of events that can be applied consecutively, beginning with C. We note that for steps
involving non-empty randomness $r_p$, the adversary does not have full control over the event:
it can only choose the message scheduling, and has no control over the local randomness of
a good processor p. In fact, when the adversary chooses a processor p to take the next step
and the message to deliver, it cannot predict the value of $r_p$ that will be sampled when p is a
good processor (before $r_p$ is sampled by p, the adversary only knows what distribution it will
be sampled from).
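
The notions of configuration, event, and schedule can be made concrete with a small sketch. Everything named here is an illustrative assumption rather than part of the paper: `Configuration`, `can_apply`, `apply_event`, `apply_schedule`, the use of `None` for the empty marker, and the caller-supplied transition function `step`.

```python
from dataclasses import dataclass, field

NULL = None  # stands in for the empty marker: no message is received

@dataclass
class Configuration:
    states: dict                                  # processor id -> internal state
    buffer: list = field(default_factory=list)    # (recipient, message) pairs sent but not yet received

def can_apply(config, event):
    """An event (p, m, r_p) can be applied if m is empty or is buffered for p."""
    p, m, _r = event
    return m is NULL or (p, m) in config.buffer

def apply_event(config, event, step):
    """Return e(C). `step(state, m, r) -> (new_state, outgoing)` is a
    hypothetical local transition function supplied by the caller."""
    if not can_apply(config, event):
        raise ValueError("event cannot be applied to this configuration")
    p, m, r = event
    buffer = list(config.buffer)
    if m is not NULL:
        buffer.remove((p, m))                     # the message leaves the buffer
    new_state, outgoing = step(config.states[p], m, r)
    states = dict(config.states)
    states[p] = new_state
    return Configuration(states, buffer + list(outgoing))

def apply_schedule(config, schedule, step):
    """Apply a sequence of events consecutively, starting from config."""
    for event in schedule:
        config = apply_event(config, event, step)
    return config
```

In this sketch the adversary's power corresponds to choosing the pair (p, m) of each event; for a good processor the third component is drawn by the processor itself.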

2.2 Adjusting Probability Distributions 

We will constrain our fully symmetric round protocols to always choose the next message ran- 
domly via some distribution on at most R possibilities, where R is a fixed constant. We note 
that the possible messages themselves can change according to the state of the processor as 
the protocol progresses: it is only the number of choices that is constrained, not the choices 
themselves. Since the probability distributions on R values can be arbitrary, we will define 
closely related distributions which have more convenient properties for our analysis. 

We let $\mathcal{D}$ denote a distribution on a set S of size at most R. We let $p_s$ denote the probability
that $\mathcal{D}$ places on $s \in S$. In our proof, we will be considering t samples of such a distribution
$\mathcal{D}$. For each $s \in S$, the expected number of times that s occurs when t independent samples of
$\mathcal{D}$ are taken is $p_s t$. In general, this may not be an integer. We will prefer to work with integral
expectations, so we define an alternate distribution $\tilde{\mathcal{D}}$ on the same set S. We let $\tilde{p}_s$ denote the
probability that $\tilde{\mathcal{D}}$ places on s for each $s \in S$. The definition of $\tilde{\mathcal{D}}$ is motivated by two goals: we
will ensure that $\tilde{p}_s t$ is a positive integer for each $s \in S$, and also that $p_s$ and $\tilde{p}_s$ are sufficiently
close for each $s \in S$.

Since the size of S is at most R, there must exist some $s^* \in S$ such that $p_{s^*} \geq 1/R$. We fix
this $s^*$, and we also fix a small real number $\epsilon > 0$ (whose precise size with respect to t, R will
be specified later). For all $s \in S - \{s^*\}$, we define $\tilde{p}_s$ to be the least positive integer multiple of
$1/t$ which is $\geq \max\{p_s, \epsilon\}$. For $s^*$, we define $\tilde{p}_{s^*} = 1 - \sum_{s \in S - \{s^*\}} \tilde{p}_s$.

Lemma 1. When $t > R^2$ and $0 < \epsilon < 1/R^2 - 1/t$, $\tilde{\mathcal{D}}$ is a probability distribution on S, and $\tilde{p}_s t$ is
a positive integer for each $s \in S$.

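Proof. By definition of $\tilde{p}_{s^*}$, we see that $\sum_{s \in S} \tilde{p}_s = 1$. To show that $\tilde{\mathcal{D}}$ is a valid distribution,
it remains to prove that $0 < \tilde{p}_s \leq 1$ for every $s \in S$. For $s \in S - \{s^*\}$, $\tilde{p}_s \geq \epsilon > 0$. Also,
$\tilde{p}_s \leq \max\{\epsilon, p_s\} + 1/t$. Since $s \neq s^*$, $\max\{\epsilon, p_s\} \leq 1 - 1/R$. Thus, $\tilde{p}_s \leq 1 - 1/R + 1/t$. Since $t > R$,
this quantity is < 1.

For $s^*$, it is clear that $\tilde{p}_{s^*} \leq 1$, since $\sum_{s \in S - \{s^*\}} \tilde{p}_s \geq 0$. For each $s \in S - \{s^*\}$, we have
$\tilde{p}_s - p_s \leq \epsilon + 1/t$. Therefore,

\[ \sum_{s \in S - \{s^*\}} \tilde{p}_s \leq \sum_{s \in S - \{s^*\}} \left( p_s + \epsilon + \frac{1}{t} \right). \]

Since the size of S is at most R and $\sum_{s \in S - \{s^*\}} p_s = 1 - p_{s^*}$, we may conclude:

\[ \sum_{s \in S - \{s^*\}} \tilde{p}_s \leq 1 - p_{s^*} + R \left( \epsilon + \frac{1}{t} \right). \]

Thus,

\[ \tilde{p}_{s^*} \geq p_{s^*} - R \left( \epsilon + \frac{1}{t} \right) > 0, \]

since $\epsilon + 1/t < 1/R^2$ and $p_{s^*} \geq 1/R$. This shows that $\tilde{\mathcal{D}}$ is indeed a probability distribution on S.

For $s \neq s^*$, $\tilde{p}_s t \in \mathbb{Z}$ follows simply from the fact that $\tilde{p}_s$ was chosen to be an integral multiple
of $1/t$. Since each $\tilde{p}_s$ for $s \in S - \{s^*\}$ is an integral multiple of $1/t$, so is $\tilde{p}_{s^*} = 1 - \sum_{s \in S - \{s^*\}} \tilde{p}_s$.
Hence $\tilde{p}_{s^*} t$ is also a positive integer. □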
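
The construction of the adjusted distribution can be checked numerically. The following is a minimal sketch using exact rational arithmetic; the function name `adjust_distribution` is ours, and picking the most likely value as $s^*$ is one valid choice among possibly several (the text only requires probability at least 1/R).

```python
from fractions import Fraction
from math import ceil

def adjust_distribution(p, t, eps):
    """Round a distribution so that every adjusted probability times t is a
    positive integer.

    p:   dict mapping each value s in S to its probability (as a Fraction)
    t:   number of samples (positive int)
    eps: small positive Fraction (the lemma assumes eps < 1/R^2 - 1/t)
    """
    # The most likely value has probability >= 1/|S| >= 1/R.
    s_star = max(p, key=p.get)
    adjusted = {}
    for s, prob in p.items():
        if s == s_star:
            continue
        target = max(prob, eps)
        # Least positive integer k with k/t >= target.
        k = max(1, ceil(target * t))
        adjusted[s] = Fraction(k, t)
    # The remaining mass goes to s_star.
    adjusted[s_star] = 1 - sum(adjusted.values())
    return adjusted
```

For instance, with R = 3, t = 20, eps = 1/20 and probabilities (2/3, 1/3, 0), the adjusted probabilities are (3/5, 7/20, 1/20), so the expected counts over t samples are the integers 12, 7, and 1.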

We will additionally use the following consequence of the Chernoff bound. The proof can
be found in Appendix A.

Lemma 2. Let D be an arbitrary distribution on a set S of at most R possible values, and let 
T> be defined from T> as above, with t = cn > (|)i? 2 and e < 7^ — \ (where c is a positive 
constant satisfying < c < <:)■ Let s £ S, and let p s ,p s denote the probabilities that V and 
T> assign to this value, respectively. Let X\, . . . , Xn c \ t denote independent random variables, 
each equal to 1 with probability p s and equal to with probability 1 — p s . Then: 



'(l-c)i 

Xi - p st 

i=l 



< e 



- Sc 3 n/ '(3(1- c)) 



where 5 is defined to be the minimum of e and ^ . 



3 Fully Symmetric Round Protocols 

We now define the class of fully symmetric round protocols. In these protocols, communication 
proceeds in rounds. These are similar to the usual notion of rounds in the synchronous setting, 
but in the asynchronous setting a round may take an arbitrarily long amount of time and
different processors may be in different rounds at any given time. Our definition is motivated by 
the core structure of Bracha's protocol, so we first review this structure. Bracha's protocol relies 
on two primitives, called Broadcast and Validate. The broadcast primitive allows a processor 
to send a value to all other processors and enforces that even faulty processors must send the 
same value to everyone or no value to anyone. The Validate primitive essentially checks that a 
value received via broadcast could have been sent by a good processor (we elaborate more on 
this below). Bracha describes the basic form of a round of his protocol as follows:¹

round(k)
    Broadcast(v)
    wait till Validate a set S of n - t k-messages
    v := N(k, S)

Here, a k-message is a message broadcast by a processor in round k, and N is the protocol
function which determines the next value to be broadcast (N is randomized). In Bracha's
protocol, N considers only the set of k-messages themselves, and does not consider which
processors sent them. This is the "symmetric" quality which we will require from fully symmetric 
round protocols. This structure and symmetry also characterize Ben-Or's protocol [3], except 
that the Broadcast and Validate primitives are replaced just by sending and receiving. We will 
generalize this structure by allowing protocol functions N which consider messages from earlier 
rounds as well. 

Fully symmetric round protocols will invoke two primitives, again called Broadcast and 
Validate. We assume these two primitives are instantiated by deterministic protocols. In each 
asynchronous round, a processor invokes the Broadcast primitive and broadcasts a message to 
all other processors (that message will be stamped with the round number). We will describe 
the properties of the broadcast primitive formally below. To differentiate from the receiving 
of messages (which simply refers to the event of a message arriving at a processor via the 
communication network), we say a processor p accepts a message m when p decides that m is 
the outcome of an instantiation of the broadcast primitive. When we refer to the round number 
of a message, we mean the round number attached to the message by its sender. For a fully 
symmetric round protocol, a round can be described as follows: 

round(k)
    Broadcast(v)
    wait till Validate a set S of n - t k-messages
    let S' denote the set of all validated i-messages for all i < k
    v := N(k, S ∪ S')

The message to be broadcast in the first round is computed as N(0, b), where b is the input
bit of the processor. As in the case of Bracha's protocol, we consider the set of messages $S \cup S'$
as divorced from the sender identities, so the protocol function N does not consider which 
processor sent which message. Note here that we have allowed the protocol function to consider 
all currently validated messages with round numbers < k (i.e. were broadcast by their senders in 
rounds < k). In contrast, the Validate algorithm may consider the processor identities attached 
to messages. 
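
The round structure above can be sketched in code as follows. This is only an illustration under stated assumptions: `toy_protocol_function` is a made-up protocol function with R = 2 (it is not Ben-Or's or Bracha's actual rule), validated messages are encoded as hypothetical `(round, sender, message)` triples, and the sketch passes every message validated so far to N rather than selecting a specific set of n - t round-k messages.

```python
import random
from collections import Counter

def strip_senders(validated):
    """Forget sender identities: keep only a multiset of (round, message)."""
    return Counter((rnd, msg) for rnd, _sender, msg in validated)

def toy_protocol_function(k, messages):
    """Hypothetical N: map a sender-free multiset to a distribution over at
    most R = 2 candidate messages for the next round."""
    ones = sum(c for (rnd, msg), c in messages.items() if rnd == k and msg == 1)
    zeros = sum(c for (rnd, msg), c in messages.items() if rnd == k and msg == 0)
    if ones > 2 * zeros:
        return {1: 1.0}
    if zeros > 2 * ones:
        return {0: 1.0}
    return {0: 0.5, 1: 0.5}   # no strong majority: flip a fair coin

def run_round(k, validated, n, t, rng):
    """Return the next message to broadcast, or None while still waiting
    for n - t validated round-k messages."""
    current = [v for v in validated if v[0] == k]
    if len(current) < n - t:
        return None
    dist = toy_protocol_function(k, strip_senders(validated))
    values, weights = zip(*dist.items())
    return rng.choices(values, weights=weights)[0]
```

Only `strip_senders` touches the sender field, which reflects the requirement that N never sees which processor sent which message.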

In summary, in each round a processor waits to validate n - t messages from other processors
for that round. Once this occurs, it applies the protocol function N to the set of validated
messages. This protocol function determines whether or not the processor decides on a final bit
value at this point (we assume this choice is made deterministically), and also determines the
message to be broadcast in the next round. This choice may be made randomly. We note that
the choice of whether to decide a final bit value (and what that value is) depends only on the set
of accepted messages themselves, and does not refer to the senders.

¹ Bracha refers to this as a "step" [6] and uses the terminology of "round" a bit differently.

Key Constraint on Randomized Behavior We constrain a processor's random choices in
the following crucial way. We assume that when a processor employs randomness to choose its
message to broadcast in the next round, it chooses from at most R possibilities, where R is a
fixed, global constant independent of all other parameters (e.g. it does not depend on the round
number or the total number of processors).² Note that the choices themselves may depend on
the round number, the total number of processors, etc. The messages themselves may also be
quite long: there is no constraint on their bit length.

Full Symmetry Fully symmetric round protocols are invariant under permutations of the 
identities associated with validated messages in each round. At the end of each round, a good 
processor may consult all previously validated messages (divorced from any information about 
their senders) and must choose a new message to broadcast at the beginning of the next round. 
It may make this choice randomly, so we think of the set $S \cup S'$ of all previously validated
messages as determining a distribution on a constant number of possible messages for the next
round. We emphasize that since $S \cup S'$ is just the set of the bare messages themselves, it
also contains no information about which messages were sent by the same processors, so the
distribution determined by $S \cup S'$ is invariant under all permutations of the processor identities
associated with messages for each round where the permutations may differ per round. 
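
The invariance described here can be illustrated with a short sketch. The encoding of validated messages as `(round, sender, message)` triples and the helper names `sender_free_view` and `relabel` are assumptions made for this example only.

```python
from collections import Counter

def sender_free_view(validated):
    """The only information the protocol function may use: a multiset of
    (round, message) pairs with sender identities removed."""
    return Counter((rnd, msg) for rnd, _sender, msg in validated)

def relabel(validated, perms):
    """Apply a permutation of sender identities separately in each round.

    perms: dict mapping a round number to a dict old_sender -> new_sender.
    """
    return [(rnd, perms[rnd][sender], msg) for rnd, sender, msg in validated]
```

Relabeling senders with a different permutation in each round changes the list of triples but leaves the sender-free view unchanged, so any distribution computed from that view is unchanged as well. In particular, the view cannot reveal whether two messages from different rounds came from the same processor.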

Broadcast and Validate Primitives We now formally define the properties we will assume 
for the broadcast and validate primitives. We recall that these are assumed to be deterministic. 
We first consider broadcast. We suppose that the broadcast primitive is invoked by a processor 
p in order to send message m to all other processors. We consider the n processors as being 
numbered 1 through n, which allows us to identify the set of processors with the set [n] := 
{1, 2, . . . , n}. We will assume that for each permutation $\pi$ of the set [n], there exists a finite
schedule of events that can be applied (starting from the current configuration) such that at
the end of the sequence of events, all processors have accepted the message m and that for
each i from 1 to n - 1, there is a prefix of the schedule such that at the end of the prefix,
exactly the processors $\pi(1), \ldots, \pi(i)$ have accepted the message m. Essentially, this means that
every possible order of acceptances can be achieved by some applicable schedule. (Note that 
within these schedules, all processors act according to the protocol.) More formally, we make 
the following definition: 

Definition 3. We say a broadcast protocol allows arbitrary receiver order if for any processor
p invoking the protocol to broadcast a message m and for any permutation $\pi$ of [n], there
exists a finite schedule $\sigma_\pi$ of events that can be applied consecutively starting from the initial
configuration such that there exist prefixes $\sigma_1, \ldots, \sigma_n = \sigma_\pi$ of $\sigma_\pi$ such that in the configuration
resulting from $\sigma_i$, exactly the processors $\pi(1), \ldots, \pi(i)$ have accepted m, and no other processors
have.
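
For the simplest broadcast, where a processor sends one copy to everyone and accepting coincides with receiving, a schedule witnessing Definition 3 is easy to write down. The sketch below is an illustration under that assumption; the function names are ours, and events are written as triples (p, m, r) with `None` for both the empty message and unused randomness.

```python
def naive_broadcast_schedule(sender, message, pi):
    """Schedule for a send-to-all broadcast with acceptance order pi.

    The first event is the sender taking a step (receiving nothing) in which
    it sends `message` to every processor. The remaining events deliver the
    copies in the order given by the permutation pi.
    """
    return [(sender, None, None)] + [(q, message, None) for q in pi]

def acceptance_prefixes(sender, message, pi):
    """Replay the schedule and record the set of processors that have
    accepted the message after each delivery."""
    buffer, accepted, prefixes = set(), [], []
    for p, m, _r in naive_broadcast_schedule(sender, message, pi):
        if m is None:
            buffer = {(q, message) for q in pi}   # one copy per processor
        else:
            buffer.remove((p, m))                 # deliver the copy to p
            accepted.append(p)
            prefixes.append(set(accepted))
    return prefixes
```

After the i-th delivery, exactly the first i processors in pi have accepted the message, matching the prefixes required by the definition.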

It is clear that this property holds if one implements broadcast simply by invoking the send
and receive operations on the communication network. This property also holds for Bracha's
broadcast primitive, which enforces that even faulty processors must send the same message to
all good processors or no message at all. This property will be useful to our adversary (who
controls scheduling) because it allows complete control over the order in which processors accept
messages. We assume that our fully symmetric round protocols treat each invocation of the
broadcast primitive as "separate" from the rest of the protocol in the sense that any messages
sent not belonging to an instance of the broadcast primitive do not affect a processor's behavior
within this instance of the broadcast primitive.

² This constraint is satisfied by Ben-Or and Bracha's protocols, since both choose from two values whenever
they choose randomly.

We consider the Validate primitive as an algorithm V which takes as input the set of all 
accepted messages so far along with accompanying information specifying the sender of each 
message. The algorithm then deterministically proceeds to mark some subset of the previously 
accepted messages as "validated" . We assume that this algorithm is monotone in the following 
sense. We let $W^+ \subseteq S^+$ be two sets of accepted message and sender identity pairs (we use the +
symbol to differentiate these sets of messages with senders from sets of messages without sender
identity attached). Then if a message, sender pair $(m, p) \in W^+$ is marked valid by $V(W^+)$,
then this same pair (m, p) will be marked valid by $V(S^+)$ as well. In other words, marking a
message as valid is a decision that cannot be reversed after new messages are accepted. 

We assume the validation algorithm is called each time a new message is accepted to check if 
any new messages can now be validated. Bracha's Validate algorithm is designed to validate only 
messages that could have been sent by good processors in each round. It operates by validating 
an accepted message $m$ for round $k$ if and only if there are $n - t$ validated messages for round 
$k - 1$ that could have caused a good processor to send $m$ in round $k$ (i.e. $m$ is an output of 
the protocol function $N$ that occurs with nonzero probability when these $n - t$ validated round 
$k - 1$ messages are used as the input set). In the context of Bracha's algorithm, where the 
behavior for one round only depends on the messages from the previous round, this essentially 
requires faulty processors to "conform with the underlying protocol" [6] (up to choosing their 
supposedly random values maliciously) or have their messages be ignored. 

In the context of protocols that potentially consider messages from all previous rounds, one 
might use a stronger standard for validation. For instance, to validate a message $m_k$ for round 
$k$ sent by a processor $p$, one might require that there are messages $m_1, \ldots, m_{k-1}$ for rounds 
1 through $k - 1$ sent by $p$ which are validated and that there are sets of validated messages 
$S_1, \ldots, S_{k-1}$ such that each $S_i$ contains messages for rounds $\le i$ and exactly $n - t$ messages for 
round $i$, $S_i \subseteq S_{i+1}$ for each $i < k - 1$, and $N(i, S_i) = m_{i+1}$ with non-zero probability for each 
$i$ from 1 to $k - 1$. This essentially checks that there is a sequence of sets of validated messages 
that $p$ could have considered in each previous round that would have caused a good processor 
to output messages $m_1, \ldots, m_k$ in rounds 1 through $k$ with non-zero probability. 

Roughly, we will allow all validation algorithms that never fail to validate a message m sent 
by a good processor p when all of the previous messages sent by p and all of the messages that 
caused the processor p to send the message m have been accepted. We call such validation 
algorithms good message complete. We define this formally as follows. 

Definition 4. A Validate algorithm $V$ is good message complete if the following condition 
always holds. Suppose that $S_1^+ \subseteq S_2^+ \subseteq \cdots \subseteq S_k^+$ are sets of validated messages (with sender 
identities attached) such that each $S_i^+$ contains exactly $n - t$ round $i$ messages and $m_1, \ldots, m_k$ 
occur with non-zero probability as outcomes of $N(1, S_1), \ldots, N(k, S_k)$ respectively. Then if a 
set $W^+$ of messages (with sender identities attached) includes $S_k^+$ as well as the messages 
$m_1, \ldots, m_k$ from the same sender, then $V(W^+)$ marks $m_k$ as validated. 

This means that if during a real execution of the protocol, a good processor $p$ computes 
its first $k$ messages $m_1, \ldots, m_k$ by applying $N(1, S_1), N(2, S_2), \ldots, N(k, S_k)$ respectively and 
another good processor $q$ has accepted $m_1, \ldots, m_k$ from $p$ as well as all of the messages in $S_k$, 
then $q$ will validate $m_k$. 
We note that the use of Validate protocols which are not good message complete seems 
quite plausible in the Monte Carlo setting at least, since a Monte Carlo algorithm can afford 
to take some small chance of not validating a message sent by a correct processor. In this case, 
it would also be plausible to consider randomized validation protocols. However, since we are 
considering Las Vegas algorithms, we will restrict our attention in this paper to deterministic, 
good message complete validation protocols. 

We have now completed our description of fully symmetric round protocols. In summary, 
they are round protocols that invoke a broadcast primitive allowing arbitrary receiver order, 
invoke a Validate primitive that is good message complete, are invariant under permutations 
of the processor identities attached to the messages in each round, and always make random 
choices from a constant number of possibilities. 

4 Impossibility of Polynomial Time for Fully Symmetric Round 
Protocols 

We are now ready to state and prove our result. 

Theorem 5. For any fully symmetric round protocol solving asynchronous Byzantine agreement 
with $n$ processors for up to $t = cn$ faults, for $c > 0$ a positive constant, there exist some values of 
the input bits and some adversarial strategy resulting in expected running time that is exponential 
in n. 

Proof. We suppose we have a fully symmetric round protocol with resilience t = cn. We will 
assume that t divides n for convenience, and also that ct is an integer. These assumptions will 
make our analysis a little cleaner, but could easily be removed. We let R denote the constant 
bound on the number of possibilities for each random choice. E will denote a positive integer, 
the value of which will be specified later. (It will be chosen as a suitable function of c, R and 
n, and will be exponential in n when c, R are positive constants.) 

We will be considering partial executions of the protocol lasting for $E$ rounds. For convenience, 
we think of our protocol as continuing for $E$ rounds even if all good processors have 
already decided (this can be artificially achieved by having decided processors send default messages 
instead of terminating). The adversary must fix $t$ faulty processors at the beginning of an 
execution. Once these processors are fixed, we divide the $n$ processors into disjoint groups of 
$t$ processors each, so there are $n/t$ groups. We will refer to the groups as $G_1$ through $G_{n/t}$. We 
choose our groups so that exactly $ct$ of the processors in each group are faulty. The main idea 
of our proof is as follows. Since the broadcast and validation primitives essentially constrain 
the behavior of faulty processors, we think of the adversary as controlling only the (supposedly) 
random choices of faulty processors as well as the message scheduling. This means that when 
faulty players invoke the randomized function $N(i, S)$, they may maliciously choose any output 
that occurs with nonzero probability. In all other respects, they will follow the protocol. 

The adversary will choose the message scheduling so that the t processors in a single group 
will proceed in lockstep: the sets of messages that they use as input to N will always be the 
same in each round. This means that all t processors in a group will be choosing their next 
message from the same distribution. Since there are only a constant number of possibilities and 
the adversary controls a constant fraction of the processors in the group, it can ensure with high 
probability that the collection of messages which are actually chosen is precisely equal to the 
expectation under the adjusted distribution. More precisely, we let $\mathcal{D}$ denote the distribution 
(on possible next round messages) resulting from applying $N$ to a particular set of messages in 
a particular round, and we let $S$ denote the set of (at most $R$) outputs that occur with nonzero 
probability. We define $D$ with respect to $\mathcal{D}$ as in Section 2.2. Then, with high probability, once 
the adversary sees the outputs chosen by the $(1 - c)t$ good processors in the group, it can choose 
the messages of the $ct$ faulty processors in the group so that the total number of processors in 
the group choosing each $s \in S$ is exactly $p_s t$, where $p_s$ is the probability of $s$ under $D$. (This is proven via Lemma 2.) 
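
The adversary's fill-in step described above can be illustrated with a minimal Python sketch. The function name `complete_group` and the dictionary `targets` (mapping each message $s$ to the integer $p_s t$) are hypothetical names introduced only for this illustration; the sketch assumes the targets are already integers and simply reports failure when the good processors have overshot some target.

```python
from collections import Counter

def complete_group(good_choices, targets, num_faulty):
    """Pick messages for the faulty processors of one group so that the
    group-wide count of each message s equals targets[s] (= p_s * t).

    Returns a list of num_faulty messages, or None when the good
    processors' choices make the exact targets unreachable.
    """
    seen = Counter(good_choices)
    # A good processor chose something outside the support: cannot happen
    # for outputs of nonzero probability, so treat it as failure.
    if any(s not in targets for s in seen):
        return None
    deficit = {}
    for s, want in targets.items():
        need = want - seen.get(s, 0)
        if need < 0:          # good processors already exceed the target
            return None
        deficit[s] = need
    if sum(deficit.values()) != num_faulty:
        return None
    # Faulty processors "choose" exactly the missing messages.
    return [s for s, need in deficit.items() for _ in range(need)]
```

For example, with targets `{"a": 3, "b": 2}`, good choices `["a", "a", "b"]` and two faulty processors, the faulty processors are assigned one `"a"` and one `"b"`. The probabilistic content of Lemma 2 (that the `None` branch is rare) is not modeled here.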

We can then consider classes of executions which proceed with these groups in lockstep as 
being defined by the set of messages used as input to N in each round by each group (as well 
as the sets of sender, round number pairs for the messages, but these pairs are divorced from 
the messages themselves). With reasonable probability, an adversary who controls the message 
scheduling and the random choices of faulty processors can force a real execution to stay within 
such a class for E rounds. We will prove there exists such a class in which some good processors 
fail to decide in the first E rounds. Putting this all together, we will conclude that there exist 
some values of the input bits and some adversarial strategy that will result in expected running 
time that is exponential in n. 

Our formal proof begins with the following definition. 

Definition 6. An $E$-round lockstep execution class $C$ is defined by a setting of the input bits for 
each group (processors in the same group will have the same input bit), message sets $S_i^j$ for all 
$1 \le i \le E$ and $1 \le j \le n/t$ where $S_i^j$ is used as the input to $N$ in round $i$ by each processor in 
group $G_j$ during some real execution, and sets $Z_i^j$ of processor, round number pairs consisting 
of all pairs $(p, k)$ such that the message broadcast by processor $p$ in round $k$ is contained in $S_i^j$. 
We require that for all $i, j$, if $(p, k) \in Z_i^j$ and processors $p$ and $p'$ are in the same group $G_{j'}$, 
then $(p', k) \in Z_i^j$ as well. 

We have required that an $E$-round lockstep execution class $C$ describe some real execution, 
but note that such an execution is not unique. There are many possible executions that correspond 
to the same class $C$. It is crucial to note that the messages in sets $S_i^j$ are not linked 
to their sender, round number pairs in $Z_i^j$. In other words, given the sets $Z_i^j$ and $S_i^j$, one has 
cumulative information about the messages and the senders, but there is no specification of who 
sent what. 

We will construct a chain $C_0, C_1, \ldots, C_L$ of $E$-round lockstep execution classes with the 
following properties: 

1. For each $S_i^j$ in each $C_\ell$, the number of occurrences of each message $s$ is exactly equal to 
$p_s t$, where $p_s$ denotes the probability on $s$ in the distribution $D$ defined from $\mathcal{D}$ as in 
Section 2.2, where $\mathcal{D}$ is the distribution on possible messages induced by $N(i - 1, S_{i-1}^j)$. 

2. Each $C_\ell$ and $C_{\ell+1}$ differ only in the sets $S_i^j$ for one group $G_j$. 

3. It is impossible for any good processor to decide the value 1 during an execution in class 
$C_0$. 

4. It is impossible for any good processor to decide the value 0 during an execution in class 
$C_L$. 

Once we have such a chain of $E$-round lockstep execution classes, we may argue as follows. 
Since each $C_\ell$ and $C_{\ell+1}$ differ only in the behavior of processors in a single group, it is impossible 
for all good processors to have decided 0 in an execution in class $C_\ell$ and all good processors to 
have decided 1 in an execution in class $C_{\ell+1}$. Since the only decision value possible in $C_0$ is 0 
and the only decision value possible in $C_L$ is 1, there must be some $C_{\ell^*}$ which leaves some good 
processors undecided. In other words, any execution in this class $C_{\ell^*}$ does not terminate in $\le E$ 
rounds. 

Finally, we will show that when the input bits match the inputs for $C_{\ell^*}$, the adversary 
can (with some reasonable probability) cause a real execution to fall in class $C_{\ell^*}$. Since $E$ is 
exponential in $n$ whenever $R, c$ are positive constants, this will prove that the expected running 
time in this case is exponential. 

We now present a recursive algorithm which generates a chain of $E$-round lockstep execution 
classes with the properties required above. 

4.1 Generating the Chain of Execution Classes 

We first describe a method for generating an $E$-round lockstep execution class from a setting 
of the input bits (these must be the same for all processors within a group) and a family of sets 
$z_i^j \subseteq [n]$, each of size $n - t$, where $i$ ranges from 1 to $E$ and $j$ ranges from 1 to $n/t$. Each set 
$z_i^j$ will be the complement of some group $G_{j'}$. This means that $z_i^j$ is a union of all but one of 
the groups, so each group of $t$ processors is either contained in $z_i^j$ or disjoint from it. Now, we 
will create an execution where for each round $i$ and each group $G_j$, the $n - t$ round $i$ messages 
validated by the members of group $G_j$ in round $i$ are precisely those sent by the processors in 
the set $z_i^j$. This will correspond to an $E$-round lockstep execution class with sets $Z_i^j$ defined as 
follows. 

We define the sets $Z_i^j$ inductively. Each set $Z_i^j$ consists of pairs $(p, k)$, where $p \in [n]$ is a 
processor and $k \le i$ is a round number. For $i = 1$, we simply define $Z_1^j$ to consist of the pairs 
$(p, 1)$ where $p \in z_1^j$. Once we have defined $Z_{i-1}^j$, we define $Z_i^j$ to be $Z_{i-1}^j$ plus the sender and 
round number pairs for any additional messages which may be needed to validate the round $i$ 
messages of the processors in set $z_i^j$. More formally, 

$$Z_i^j := Z_{i-1}^j \;\cup\; \{(p,k) \mid p \in z_i^j,\ k \le i\} \;\cup \bigcup_{j' \text{ s.t. } G_{j'} \cap z_i^j \neq \emptyset} Z_{i-1}^{j'} \qquad (1)$$

The second set in this union corresponds to the set of messages sent by processors in $z_i^j$ for 
all rounds $\le i$, while the final set in the union contains all the sets $Z_{i-1}^{j'}$ for groups $G_{j'}$ that 
intersect $z_i^j$. By good message completeness of our validation algorithm, this suffices to ensure 
that the round $i$ messages from senders in $z_i^j$ will be validated once all of the messages with 
sender, round number pairs in $Z_i^j$ have been accepted (in fact all of the messages whose sender, 
round number pairs appear in $Z_i^j$ will be validated, assuming all processors follow the protocol 
except for manipulating their supposedly random coins). We note that these sets $Z_i^j$ satisfy the 
required property that if $(p, k) \in Z_i^j$, then $(p', k) \in Z_i^j$ as well for any $p'$ in the same group as $p$. 
(This follows from induction on $i$ and the fact that this holds for the sets $z_i^j$.) 
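
The inductive definition (1) can be sketched in Python as follows. Because every set involved is a union of whole groups, this sketch works at the granularity of groups rather than individual processors, which is a simplifying assumption of the illustration. The name `build_Z` and the encoding `excluded[i-1][j]` (the single group left out of $z_i^j$, with groups numbered from 0) are hypothetical.

```python
def build_Z(excluded, num_groups):
    """Compute group-level versions of the sets Z_i^j from equation (1).

    excluded[i-1][j] is the index of the one group omitted from z_i^j.
    Returns Z with Z[i-1][j] a set of (group, round) pairs.
    """
    num_rounds = len(excluded)
    Z = []
    for i in range(1, num_rounds + 1):
        row = []
        for j in range(num_groups):
            # Groups whose round-i messages are used by group j.
            members = [g for g in range(num_groups) if g != excluded[i - 1][j]]
            # All messages sent by members of z_i^j in rounds <= i.
            cur = {(g, k) for g in members for k in range(1, i + 1)}
            if i > 1:
                cur |= Z[i - 2][j]          # Z_{i-1}^j
                for g in members:           # Z_{i-1}^{j'} for groups meeting z_i^j
                    cur |= Z[i - 2][g]
            row.append(cur)
        Z.append(row)
    return Z
```

With three groups and `excluded = [[1, 0, 0], [1, 0, 0]]`, group 0 never uses group 1's messages directly, yet the pair `(1, 1)` still enters `Z[1][0]` through group 2, which relied on group 1 in round 1. This is the indirect dependence that the last term of (1) captures.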

We next use the sets Z\ to define a set of permutations (one for each processor, round 
number pair) that we will use to specify the message scheduling during an execution. 

Definition 7. A set of permutations $\{\pi_{p,i}\}$ on $[n]$ (one for each processor $p$ and each round 
$1 \le i \le E$) corresponding to sets $\{Z_i^j\}$ is defined as follows. For each processor $p$ and round 
number $i$, we consider each group index $j$ (from 1 to $n/t$). For each group $G_j$, there is a 
minimal round $k$ for which $(p, i) \in Z_k^j$ (if $(p, i)$ is not in any of these sets, then define this 
minimal $k$ to be $\infty$). This induces an ordering on the groups. We will define $\pi_{p,i}$ by setting 
$\pi_{p,i}(1), \ldots, \pi_{p,i}(t)$ to be processors in the group with the lowest associated minimal round $k$, 
setting $\pi_{p,i}(t + 1), \ldots, \pi_{p,i}(2t)$ to be processors in the group with the second lowest associated 
minimal round $k$, and so on. (Within a group, the processors can be ordered arbitrarily. If two 
groups have the same $k$, their order can also be chosen arbitrarily.) 
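
As an illustration of Definition 7, the following Python sketch orders the receivers of one broadcast. It assumes a group-level encoding in which `Z[k-1][j]` is a set of `(sender_group, round)` pairs and `group_members[j]` lists the processors of group `j`; these names and the 0-based numbering are assumptions of the sketch, and ties are broken by group index since the definition allows any tie-breaking.

```python
import math

def receiver_order(Z, sender_group, rnd, group_members):
    """Return the processors in the order they accept the round-`rnd`
    message of a sender in `sender_group`, following Definition 7."""
    def first_round(j):
        # Minimal k with (sender, rnd) in Z_k^j, or infinity if none.
        for k in range(1, len(Z) + 1):
            if (sender_group, rnd) in Z[k - 1][j]:
                return k
        return math.inf

    # sorted() is stable, so equal keys keep increasing group index.
    groups = sorted(range(len(group_members)), key=first_round)
    return [p for j in groups for p in group_members[j]]
```

Groups that need the message earliest come first, so a schedule that delivers to a prefix of this order can stop exactly when the intended groups have accepted.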

These permutations $\pi_{p,i}$ will be used to specify the order in which the processors will accept 
the message broadcast by processor $p$ in round $i$. Note that for any set of permutations $\{\pi_{p,i}\}$, 
we can choose our message scheduling to achieve them, since our broadcast protocol allows 
arbitrary receiver order and separate invocations of it in our fully symmetric round protocols 
are treated independently of each other. 

We now consider running an execution as follows. In the first round, each processor in group 
$G_j$ will choose its message to broadcast from the same distribution $\mathcal{D}_{init}$ over $\le R$ possibilities 
induced by $N(0, b_j)$, where $b_j$ is the input bit of processors in group $G_j$ (we are assuming that 
the input bit is the same for all processors within a group, though different groups may have 
different inputs). We let $D_{init}$ be defined from $\mathcal{D}_{init}$ as in Section 2.2. We choose a multi-set 
of $t$ messages from the $\le R$ possible messages such that the number of occurrences of each 
is equal to $t$ times its probability under $D_{init}$. (This is always possible because $t$ times each 
probability in $D_{init}$ is a positive integer, by design.) We assign these messages arbitrarily to the 
processors in group $G_j$. (This can be thought of as assigning the random coins of each processor 
to make these outcomes happen.) For each $p$, we then run part of the finite schedule $\sigma_{\pi_{p,1}}$ for $p$'s 
broadcast in round 1, stopping when exactly those processors in groups $j$ such that $(p, 1) \in Z_1^j$ 
have accepted the message. At this point, all processors in each group $G_j$ are ready to compute 
their round 2 messages by calling $N(1, S_1^j)$, where $S_1^j$ contains the messages whose sender, round 
number pairs appear in $Z_1^j$. We can now describe the rest of the execution by specifying what 
happens for an arbitrary round $i$, assuming the previous round is just completing. 

We assume (by induction) that at the end of round $i - 1$ when the message to be broadcast 
in round $i$ is computed, all processors in each group $G_j$ have accepted and validated exactly the 
set of messages $S_{i-1}^j$ corresponding to the sender, round number pairs in $Z_{i-1}^j$. Hence, every 
processor in group $G_j$ will be computing its round $i$ message by applying $N(i - 1, S_{i-1}^j)$ for the 
same set $S_{i-1}^j$ of accepted messages. We let $\mathcal{D}_{i,j}$ denote the probability distribution on the 
$\le R$ possible outputs of $N(i - 1, S_{i-1}^j)$, and we let $D_{i,j}$ be defined from $\mathcal{D}_{i,j}$ as in Section 2.2. 
We choose a multi-set of $t$ messages from the possible outcomes of $N(i - 1, S_{i-1}^j)$ such that the 
number of occurrences of each is equal to $t$ times its probability under $D_{i,j}$. We assign these 
messages arbitrarily to the members of group $G_j$. 
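
The multi-set selection in this step is simple enough to state as code. The sketch below assumes the adjusted distribution is given as a dictionary of exact `Fraction` probabilities (a hypothetical encoding, as is the name `lockstep_multiset`) and checks the integrality property that the construction of Section 2.2 is said to guarantee.

```python
from fractions import Fraction

def lockstep_multiset(adjusted, t):
    """Build the multi-set of t messages in which each message s appears
    exactly t * adjusted[s] times.

    `adjusted` maps messages to exact probabilities (Fractions) under the
    adjusted distribution; raises ValueError if some count is not integral.
    """
    messages = []
    for s, p in adjusted.items():
        count = Fraction(p) * t
        if count.denominator != 1:
            raise ValueError(f"t * p is not an integer for message {s!r}")
        messages.extend([s] * int(count))
    if len(messages) != t:
        raise ValueError("probabilities do not sum to 1")
    return messages
```

Exact rational arithmetic is used here so that the integrality check is not disturbed by floating-point rounding.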

Now, each of the processors invokes the broadcast protocol on its round $i$ message. We 
begin running each corresponding finite schedule $\sigma_{\pi_{p,i}}$, stopping at the point where exactly 
those processors in groups $j$ such that $(p, i) \in Z_i^j$ have accepted the message. Also, for every 
processor $p$ and every round $k < i$, we continue running the finite schedules $\sigma_{\pi_{p,k}}$ to the point 
where exactly all processors in groups $j$ such that $(p, k) \in Z_i^j$ have accepted the round $k$ message 
from $p$. This ensures that every processor in each group $G_j$ will accept and validate precisely 
the set of messages $S_i^j$ whose sender, round number pairs appear in $Z_i^j$. We continue in this 
way through $E$ rounds. This is a real execution that corresponds to an $E$-round execution class 
with sets $Z_i^j, S_i^j$. 

To generate our chain of executions, we begin by defining the initial $C_0$. This is done by 
setting all of the input bits equal to 0, and choosing sets $z_i^j \subseteq [n]$ arbitrarily (each is the 
complement of a single group). The sets $Z_i^j, S_i^j$ are then derived as above. This gives us an 
$E$-round lockstep execution class $C_0$ corresponding to some real execution in which all of the 
input bits are 0. (This ensures property 3 for our chain.) 

To produce the rest of the chain $C_1, \ldots, C_L$, we employ a recursive algorithm called ChainGenerator. 
The algorithm is designed to produce a gradual shift from a lockstep execution 
class with all input bits equal to 0 to a lockstep execution class with all input bits equal to 1. 
This is accomplished by changing the inputs of one group at a time. In order to change a single 
group's inputs without affecting the behavior of other processors through the first $E$ rounds, 
we must first move to a lockstep execution class where the messages sent by this group are not 
accepted by processors in other groups until after they have completed $E$ rounds. We choose 
the group size to be $t$ in order to make this possible. (Group sizes $< t$ could also be employed, 
but once the group size gets too small, the adversary will not have enough control to make the 
group's cumulative behavior match its adjusted expectation.) 

To reach an $E$-round lockstep execution class where a particular group's messages are not 
heard by other processors, we follow an inductive strategy with round $E$ acting as the base 
case. Suppose that we want to change the inputs for processors in group $G_j$. We cannot do 
this immediately if it might affect the behavior of processors outside this group in the first $E$ 
rounds. We define $i - 1$ to be the earliest round in which the set $z_{i-1}^{j'}$ for some other group 
$G_{j'}$ includes a sender in $G_j$. We now seek to change the set $z_{i-1}^{j'}$ to be the complement of 
group $G_j$. Now we have a new instance of the same problem: in order to change what messages 
group $G_{j'}$ members accept in round $i - 1$ without affecting processors outside of this group, we 
must first get to a lockstep execution class where processors outside of group $G_{j'}$ do not accept 
messages sent from group $G_{j'}$ with round numbers $\ge i$ until they have completed $E$ rounds. 
The important thing to notice here is that the new instance of the problem always involves a 
higher round number. Hence, we can formulate this as a recursion, and eventually we reach a 
point where it is enough to ensure that the messages of some group $G_{j''}$ with round numbers 
$\ge i'$ are not heard by some other group $G_{j'''}$ in round $E$. This is now easy to do, since we can 
arrange for the $n - t$ other round $E$ messages to be validated while we delay the messages with 
round numbers $\ge i'$ from group $G_{j''}$ to $G_{j'''}$, so the processors in group $G_{j'''}$ can exit round $E$. 
(Notice here that $E$ will be the earliest round in which any group may receive messages with 
round number $\ge i'$ from $G_{j''}$, and this ensures that these messages cannot be needed to validate 
the round $E$ messages of processors outside $G_{j''}$.) 

More concretely, to change the inputs of some group $G_{g_1}$ from 0 to 1, we begin by initializing 
a list of group number, round number pairs with the element $(g_1, 1)$. Having a pair $(g_\ell, r_\ell)$ as 
the last element of our list means that our goal is to arrive at sets $z_i^j$ such that for all $j \ne g_\ell$ 
and all $i \ge r_\ell$, $z_i^j$ is the complement of group $G_{g_\ell}$. (In other words, the messages that group 
$G_{g_\ell}$ sends in rounds $\ge r_\ell$ are not heard by processors in other groups.) If our sets $z_i^j$ do not 
currently satisfy this, we add a pair $(g_{\ell+1}, r_{\ell+1})$ to the list where $r_{\ell+1} - 1$ is the minimal value 
of $i \ge r_\ell$ such that some $z_i^j$ with $j \ne g_\ell$ includes group $G_{g_\ell}$ (and $g_{\ell+1}$ is the corresponding $j$ 
value for such a set $z_i^j$). Now our (sub)goal is to arrive at sets $z_i^j$ where the messages sent by 
group $G_{g_{\ell+1}}$ in rounds $\ge r_{\ell+1}$ are not heard by processors in other groups. Once this holds, 
we can change the set $z_{r_{\ell+1}-1}^{g_{\ell+1}}$ to be the complement of group number $g_\ell$, and we can remove 
the pair $(g_{\ell+1}, r_{\ell+1})$ from the list. Now, there may be other sets $z_i^j$ with $i \ge r_\ell$ and $j \ne g_\ell$ 
including group $G_{g_\ell}$ that we will need to deal with next. However, since we always consider 
the minimal such $i$, we will not undo the progress we have made by changing $z_{r_{\ell+1}-1}^{g_{\ell+1}}$ in the 
process of addressing these other sets. Since there are a finite number of groups and we are 
considering a finite number of rounds, this process will always eventually terminate. 

The full description of the recursive function and the proof that it produces a suitable 
chain of $E$-round lockstep execution classes is below. The function takes in three arguments: 
a specification of $n$ input bits (denoted $x_1, \ldots, x_n$), a family of sets $\{z_i^j\}$, and an ordered list 
$\mathcal{L}$ of pairs: each pair contains a group number and a round number between 1 and $E + 1$. We 
denote the $k$th element of the list by $(g_k, r_k)$, where $g_k$ is the group number, and $r_k$ is the round 
number. The round numbers in the list will always be strictly increasing. The first element of 
the list will always be of the form $(g_1, 1)$. We denote the size of the list by $|\mathcal{L}|$. The input bits 
$x_1, \ldots, x_n$ will always be consistent within each group. 

There is also a required relationship between the ordered list and the sets $z_i^j$. For each pair 
$(g_{k-1}, r_{k-1})$ on the list that is followed by a pair $(g_k, r_k)$, round $r_k - 1$ must be the earliest 
round $\ge r_{k-1}$ in which any set $z_i^j$ for a group $G_j \ne G_{g_{k-1}}$ includes the group $G_{g_{k-1}}$. In other 
words, for all $r_{k-1} \le i < r_k - 1$ and all $j \ne g_{k-1}$, $z_i^j$ is the complement of group $G_{g_{k-1}}$. This 
relationship will be maintained in the arguments to the recursive calls the function makes to 
itself. 

We initially call our recursive function ChainGenerator with input bits all equal to 0, the 
sets $z_i^j$ used in defining $C_0$, and the list initialized to $(1, 1)$. The function then proceeds to call 
itself with new arguments. Each time a change is made to the input bits and/or to the sets $z_i^j$, 
we produce a new execution class generated as above from the new input bits and sets. 

4.1.1 The Recursive Algorithm 

ChainGenerator($(x_1, \ldots, x_n)$, $\{z_i^j\}$, $\mathcal{L}$): We set $\ell = |\mathcal{L}|$. If $\ell \ge 2$, we proceed as follows. We 
examine the last element of the list $\mathcal{L}$, denoted by $(g_\ell, r_\ell)$. We consider two possible cases. Case 
1 occurs when none of the sets $z_i^j$ for values of $j \ne g_\ell$ and $r_\ell \le i \le E$ include group $g_\ell$. Case 2 
occurs when there is some $z_i^j$ for $j \ne g_\ell$, $r_\ell \le i \le E$ that does contain group $g_\ell$. 

We first consider case 1. We define new sets $\tilde{z}_i^j$ as follows. For all $j \ne g_\ell$, we set $\tilde{z}_i^j = z_i^j$ for 
all $i$ (these sets are unchanged). For $j = g_\ell$, we set $\tilde{z}_i^{g_\ell} = z_i^{g_\ell}$ for all $i \ne r_\ell - 1$. We define $\tilde{z}_{r_\ell - 1}^{g_\ell}$ to 
be the complement of group $g_{\ell-1}$. These new sets are then used (as above) to derive sets $Z_i^j, S_i^j$ 
corresponding to an $E$-round execution class $C$, using the (unchanged) input bits $(x_1, \ldots, x_n)$. 
We output $C$ as the next $E$-round lockstep execution class in the chain. We remove $(g_\ell, r_\ell)$ from 
the list $\mathcal{L}$ to form a new list $\mathcal{L}'$. We then call ChainGenerator($(x_1, \ldots, x_n)$, $\{\tilde{z}_i^j\}$, $\mathcal{L}'$). 

We observe the following. We know that round number $r_\ell - 1$ is the minimum of all round 
numbers $i \ge r_{\ell-1}$ such that some $z_i^j$ for $j \ne g_{\ell-1}$ includes group $g_{\ell-1}$. (This follows from the 
required relationship between the ordered list $\mathcal{L}$ and the sets $z_i^j$.) Thus, when we create the new 
sets $\tilde{Z}_i^j$ from the new $\tilde{z}_i^j$'s, the set $\tilde{Z}_{r_\ell - 1}^{g_\ell}$ will no longer include any sender, round number pairs with 
senders in group $g_{\ell-1}$ and round numbers $\ge r_{\ell-1}$. This ensures that the members of group $g_\ell$ 
can now proceed through round $r_\ell - 1$ without accepting any messages from group $G_{g_{\ell-1}}$ with 
round numbers $\ge r_{\ell-1}$. 

To confirm that our constraints on the input arguments are satisfied, note that $\mathcal{L}'$ is a sublist 
of $\mathcal{L}$, so its round numbers remain strictly increasing. Also, we have only changed the sets $z_i^j$ 
in rounds $i \ge r_\ell - 1$, so if $\mathcal{L}'$ still has size at least two, we have preserved the fact that round 
$r_{\ell-1} - 1$ is the earliest round in which any set $z_i^j$ for $j \ne g_{\ell-2}$ and $i \ge r_{\ell-2}$ includes group 
number $g_{\ell-2}$. (Note that $r_{\ell-1} - 1 < r_\ell - 1$.) 

We now consider case 2. In this case, there is some $j \ne g_\ell$, $r_\ell \le i \le E$ such that $z_i^j$ does 
contain group $g_\ell$. Among these $i, j$ values, we fix a pair $(i^*, j^*)$ where $i^*$ is minimal. We define 
$r_{\ell+1} = i^* + 1$. We note that $r_{\ell+1} > r_\ell$. We define $g_{\ell+1} = j^*$. We append the pair $(g_{\ell+1}, r_{\ell+1})$ to 
the list $\mathcal{L}$ to form the new list $\mathcal{L}'$. We then call ChainGenerator($(x_1, \ldots, x_n)$, $\{z_i^j\}$, $\mathcal{L}'$). Note 
that in this case, the input bits and the sets $\{z_i^j\}$ are unchanged, so $\{z_i^j\}, \mathcal{L}'$ still satisfy our 
requirements by construction. 

We are left to handle the case of $\ell = 1$. In this case, we have a single pair $(g_1, 1)$ in the list. 
We again consider two cases. In case 1, none of the sets $z_i^j$ for $j \ne g_1$ and $1 \le i \le E$ include the 
group $g_1$. In case 2, there is some $z_i^j$ for $j \ne g_1$ that does include a sender in group 
$g_1$. 

We consider case 1. We first change the input bits $x_1, \ldots, x_n$ to new bits $\tilde{x}_1, \ldots, \tilde{x}_n$ by setting 
all of the input bits for processors in group $g_1$ to be 1 (the other inputs remain unchanged). 
We leave the sets $\{z_i^j\}$ unchanged. Using the input bits $\tilde{x}_1, \ldots, \tilde{x}_n$ and the sets $z_i^j$, we derive 
sets $Z_i^j, S_i^j$ corresponding to an $E$-round lockstep execution class $C$. (Note that the sets $Z_i^j$ are 
unchanged, because they only depend on the $z_i^j$'s and not on the input bits.) We output $C$ as 
the next $E$-round lockstep execution class in the chain. If $g_1 = n/t$, we terminate. Otherwise, 
we define the new list $\mathcal{L}'$ to be $\{(g_1 + 1, 1)\}$, and we call ChainGenerator($(\tilde{x}_1, \ldots, \tilde{x}_n)$, $\{z_i^j\}$, $\mathcal{L}'$). 

3 Note that this constraint is vacuous when $|\mathcal{L}| = 1$. 

We now consider case 2. In this case, there is some $j \ne g_1$, $1 \le i \le E$ such that $z_i^j$ does 
contain group $g_1$. Among these $i, j$ values, we fix a pair $(i^*, j^*)$ where $i^*$ is minimal. We define 
$r_2 = i^* + 1$. We note that $r_2 > r_1 = 1$. We define $g_2 = j^*$. We append the pair $(g_2, r_2)$ to 
the list $\mathcal{L}$ to form the new list $\mathcal{L}'$. We then call ChainGenerator($(x_1, \ldots, x_n)$, $\{z_i^j\}$, $\mathcal{L}'$). Note 
that in this case, everything except the list is unchanged, and $\mathcal{L}'$ satisfies our requirements by 
construction. This concludes the description of the algorithm. 
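
To make the control flow of ChainGenerator concrete, here is a minimal iterative Python sketch. It is a simplification under stated assumptions, not the full construction: it tracks only one input bit per group and, for each $z_i^j$, the index of the single omitted group (`excluded[i-1][j]`, groups numbered from 0), and it records a snapshot of these instead of deriving the sets $Z_i^j, S_i^j$. The names `chain_generator`, `excluded` and `emit` are hypothetical.

```python
def chain_generator(num_groups, num_rounds, excluded):
    """Iterative, group-level sketch of ChainGenerator.

    Returns the list of emitted snapshots (input bits, excluded-group table),
    one per execution class output after C_0.
    """
    bits = [0] * num_groups
    excl = [row[:] for row in excluded]   # excl[i-1][j]: group omitted from z_i^j
    pairs = [(0, 1)]                      # the ordered list of (group, round) pairs
    chain = []

    def emit():
        chain.append((tuple(bits), tuple(tuple(row) for row in excl)))

    while True:
        g, r = pairs[-1]
        # Sets z_i^j with j != g and r <= i <= E that still include group g,
        # listed with the smallest round i first.
        hits = [(i, j)
                for i in range(r, num_rounds + 1)
                for j in range(num_groups)
                if j != g and excl[i - 1][j] != g]
        if hits:
            # Case 2: push the subgoal (j*, i* + 1).
            i_star, j_star = hits[0]
            pairs.append((j_star, i_star + 1))
        elif len(pairs) >= 2:
            # Case 1, list length >= 2: make z_{r-1}^{g} the complement
            # of the previous group on the list, output a class, pop.
            excl[r - 2][g] = pairs[-2][0]
            emit()
            pairs.pop()
        else:
            # Case 1, single pair: flip this group's inputs to 1.
            bits[g] = 1
            emit()
            if g == num_groups - 1:
                return chain
            pairs = [(g + 1, 1)]
```

Running it on two groups and two rounds, with each group initially omitting itself (`excluded = [[0, 1], [0, 1]]`), produces a short chain in which consecutive snapshots differ in a single table entry or a single group's input bit, and the final snapshot has all input bits equal to 1.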

4.1.2 Proof of Correctness for the Algorithm 

We now prove that this algorithm produces a chain of $E$-round lockstep execution classes with 
the desired properties 1 through 4. 

Lemma 8. When called with the initial arguments $(0, 0, \ldots, 0)$, the sets $\{z_i^j\}$ for $C_0$, and $\mathcal{L} := 
\{(1, 1)\}$, the function ChainGenerator eventually terminates and produces a chain of $E$-round 
lockstep execution classes satisfying properties 1 through 4 listed above. 

Proof. Property 1 is satisfied by construction. Property 3 follows from the fact that $C_0$ is taken 
from a real execution in which the input bits of all good processors are 0; a good processor 
in such an execution deciding on the value 1 would violate the correctness conditions of the 
protocol. 

To prove property 2, we consider how adjacent classes in the chain are generated. A new 
class is produced when exactly one of two things happens: either the sets $z_i^j$ change, or the input 
bits change. When the sets $z_i^j$ change, it is actually only one set $z_{r_\ell - 1}^{g_\ell}$ that changes. This means 
that all of the sets $Z_i^j$ for $i < r_\ell - 1$ are exactly the same for the two adjacent classes. In fact, 
the sets $Z_i^j$ are exactly the same for all $i \le E$ for all $j \ne g_\ell$, since we have made sure that there 
are no sender, round number pairs with senders in group $g_\ell$ and round numbers $\ge r_\ell$ in any of 
these $Z_i^j$'s. Since the input bits are unchanged, this means that all of the sets $S_i^j$ for $j \ne g_\ell$ will 
be the same for the two adjacent classes. If instead it is the input bits of a group $j'$ that have 
changed between the adjacent classes, then the sets $z_i^j$ and $Z_i^j$ are the same, and none of the 
$Z_i^j$'s for $j \ne j'$ include any senders from group $j'$. Thus, the change in this group's inputs does 
not affect the behavior of processors outside the group through the first $E$ rounds, and we have 
sets $S_i^j$ which are identical for all $i$ whenever $j \ne j'$. This establishes property 2. 

Now we must prove termination and property 4. First, we note that termination implies 
property 4. To see this, consider the termination condition. Termination occurs precisely when 
the list $\mathcal{L}$ is equal to $\{(n/t, 1)\}$ and the final $E$-round lockstep execution class produced (call 
this $C_L$) has input values of 1 for all the members of the final group. To reach this point, the 
algorithm must have gone through calls where the list was of the form $\{(j, 1)\}$ for each $j$ from 
1 to $n/t$. The only way for the algorithm to get from a list of $\{(j, 1)\}$ to a list of $\{(j + 1, 1)\}$ 
is to produce a sequence of intermediary $E$-round lockstep execution classes which begins with 
the input bits of group $j$ all being 0 and ends with these input bits all being 1. Once these 
inputs are changed to 1, they are never changed back. Thus, if the algorithm terminates after 
producing $C_L$, then $C_L$ must correspond to an execution in which all of the input bits of good 
processors are equal to 1. Thus, termination implies property 4. 

Finally, we prove the algorithm terminates. We consider the way the list $\mathcal{L}$ evolves as the 
algorithm runs. When a pair is added to the list, its round number is always strictly greater than 
the round number of the previous list element. These round numbers will never exceed $E + 1$. 
To verify this, note that when the round number of the last list entry $(g_\ell, r_\ell)$ is $E + 1$, the 
condition that no sets $z_i^j$ with $j \ne g_\ell$ and $i \ge E + 1$ include group $g_\ell$ is trivially satisfied, since 
we end at round $E$. Thus, the list cannot grow at this point, and instead we are guaranteed to 
remove its last element. We have thus shown that pairs with round number $E + 1$ are always 
guaranteed to be removed from the list. 

We now employ induction on the round number. Suppose that at some point during the 
running of the algorithm, we have a list whose last pair has round number $r$, and all pairs with 
round numbers strictly higher than $r$ are guaranteed to be eventually removed from the list. 
One of two things can happen next: either we will remove the last pair with round number $r$, 
or we will add a new pair to the list with round number greater than $r$. The round number of 
this new pair, which we will call $r'$, is chosen to be minimal. We let $g'$ denote the group number 
of this new pair. By the inductive hypothesis, we know that this new pair will eventually be 
removed from the list (since $r' > r$). From the time that we added the pair with round number 
$r'$ to the point when we remove it, all of the sets $z_i^j$ for rounds $i < r' - 1$ remain unchanged, 
and for $i = r' - 1$, the only set that changes is $z_{r'-1}^{g'}$. 

Next, we will either remove the pair with round number r from the list, or we will add a 
new pair with the new minimal round number, r" . Since we have left the sets z\ for all j and 
all i < r' — 1 unchanged, this new minimal round number must satisfy r" > r', and if r" = r', 
the new group number cannot be equal to gg, since we have already fixed the set z 9 J_ v As we 
continue adding and removing new pairs with round numbers > r' , this fix will not be undone. 
Hence, since there are a finite number of groups, we will eventually progress to a point where 
the new minimal round number is > r' . This minimal round number will continue increasing 
upward, but it cannot exceed E + 1. This means that at some point, we will remove the pair 
with round number r. 

We may conclude that all pairs added to the list are eventually removed. Applying this to 
the pairs which are added with round number equal to 1, we see that the inputs for each group 
will eventually be changed from 0 to 1, which guarantees that the process will terminate, with 
all inputs equal to 1. □ 

4.2 Completing the Proof of Theorem 5 

We consider our chain of E-round lockstep execution classes $C_0, \ldots, C_L$ satisfying properties 1 
through 4. Among these, there is some $C_{\ell^*}$ which results in some good processors remaining 
undecided after E rounds. We now use Lemma 2 to complete our proof. We recall that 
this lemma shows that when the good processors in a group each sample their next message 
independently from the same distribution D on at most R possibilities, with high probability 
the adversary can choose the "random" bits of the faulty processors in the group to ensure 
that the number of times each possible message is chosen within the group exactly matches the 
expected number under distribution D. 

We let $\delta'$ denote the value $\delta c^3/(3(1 - c))$ appearing in the statement of Lemma 2. We note 
that $\delta'$ is a positive constant which can be chosen to depend only on R and c (recall that $\epsilon$ is 
chosen with respect to R and c). We consider an execution which begins with the same input 
bits as $C_{\ell^*}$. As the execution runs, the adversary will choose the message scheduling and the 
supposedly random bits for the faulty processors in an attempt to create message sets through 
the first E rounds that match the sets $S_j^i$ associated to $C_{\ell^*}$. We note that the scheduling can 
be chosen according to permutations $\pi_{p,i}$ for each processor p and each round i derived from 
the sets $Z_j^i$ for the class $C_{\ell^*}$ as described in the corresponding definition. More precisely, in each round i, we 
run (or continue running) parts of the finite schedules $\sigma_{\pi_{i'}}$ for all $i' \leq i$ and stop when exactly 
those processors in groups j with $(p, i') \in Z_j^i$ have accepted each message. 

In order for the adversary to be successful in creating an execution that falls into class Cg* , 
it must ensure that the messages chosen in each round by each group conform precisely to the 
expected numbers for each possibility under the corresponding distribution D. This can be 
done as long as the number of good processors in the group choosing each possibility s does not 
exceed the expected number, $p_s t$. When this holds, the adversary can set the messages of the 
faulty processors in the group so that each expectation is matched precisely. We note that the 
sets $S_j^i$ always contain all of the round k messages sent by a group or none of them (recall we 
have required that if $(p, k) \in Z_j^i$ for any processor p, any rounds i, k, and any group j, then 
$(p', k) \in Z_j^i$ for all processors p' in the same group as p). Thus, as long as the adversary achieves 
the desired multi-set of messages for each group, the sets $S_j^i$ of $C_{\ell^*}$ will be attained. (It does 
not matter which processor from each group sends which message, as long as the multi-set of 
messages produced by each group matches the specification of $C_{\ell^*}$.) 

Since there are at most R possible messages for each group and there are n/t = 1/c groups, 
the union bound in combination with Lemma 2 ensures that the probability of the adversary 
failing in any given round is at most $\frac{R}{c} e^{-\delta' n}$. Thus, the adversary will succeed in producing the 
sets $S_j^i$ associated with $C_{\ell^*}$ through E rounds with probability at least $1 - \frac{ER}{c} e^{-\delta' n}$. When the 
adversary succeeds, some good processors will remain undecided at the end of E rounds. 

We now fix the value of E as: 

\[ E := \frac{c}{2R} e^{\delta' n}. \]

This is exponential in n, and the probability that the adversary can force the execution to last 
for at least E rounds is at least 1/2. This proves that the expected running time is exponential, which 
completes our proof of Theorem 5. □ 
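
As a sanity check on this choice of E, the arithmetic can be verified numerically. The sketch below assumes the per-round failure bound is (R/c) e^{-δ' n}, as the union bound over at most R messages and 1/c groups suggests, and that E = (c/(2R)) e^{δ' n}. The function names and the sample parameter values are illustrative only.

```python
import math


def per_round_failure_bound(R, c, delta_prime, n):
    """Union bound over at most R messages in each of 1/c groups (assumed form)."""
    return (R / c) * math.exp(-delta_prime * n)


def rounds_forced(R, c, delta_prime, n):
    """The value of E fixed at the end of the proof (assumed form)."""
    return (c / (2 * R)) * math.exp(delta_prime * n)


def success_lower_bound(R, c, delta_prime, n):
    """Lower bound on the probability that the adversary survives all E rounds,
    obtained by a union bound over the E rounds."""
    E = rounds_forced(R, c, delta_prime, n)
    return 1 - E * per_round_failure_bound(R, c, delta_prime, n)


if __name__ == "__main__":
    for n in (200, 400, 800):
        E = rounds_forced(R=4, c=0.25, delta_prime=0.01, n=n)
        p = success_lower_bound(R=4, c=0.25, delta_prime=0.01, n=n)
        print(n, E, p)
```

The exponential factors cancel, so the success bound stays at 1/2 for every n while E grows exponentially; the expected number of rounds is therefore at least E/2.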

5 Directions for Future Work 

We have proven that for any fully symmetric round protocol, there are some input values and an 
adversarial strategy that will force the execution to run for an exponential number of rounds with 
constant probability. This results in an exponential expected running time in general for values 
of t which are linear in n. Our work leaves many interesting open questions and illuminates 
several potential directions for future work on understanding the range of possible behaviors for 
randomized Byzantine agreement algorithms in the asynchronous, full information setting. We 
hope that the restrictions we placed on fully symmetric round protocols in order to implement 
our proof strategy may provide useful clues for where one should look when searching for 
polynomial expected time algorithms (particularly Las Vegas algorithms). Informally speaking, 
we may ask: how far does one have to go beyond the realm of fully symmetric round protocols 
in order to find an expected polynomial time algorithm? Does one have to abandon symmetry 
completely? Or might one deviate from our specifications in more subtle ways? 

Weaker Symmetry For instance, we could consider an enlarged class of protocols that is 
symmetric in a weaker sense: behavior could still be invariant under permutations of the 
processor identities attached to accepted messages, but these permutations could be fixed for the 
entire history of previous rounds, instead of being allowed to change per round. We do not know 
whether our impossibility result can be extended to protocols exhibiting this weaker kind of 
symmetry. 

More Randomness It is also intriguing to consider the small change of lifting the restriction 
on the number of random choices. Though our probabilistic analysis is not nearly optimized, 
it does seem fairly sensitive to the number of possibilities considered when a processor makes 
a random choice. Having more choices will considerably decrease the adversary's chances of 
arranging the numbers of all outcomes to conform with their adjusted expectations. However, 
it is not clear how to leverage more randomness to achieve faster Las Vegas algorithms. 

Use of Round Structure and Primitives It is also worth considering if the seemingly 
natural notion of a round (imported from the synchronous setting) may have a restrictive 
effect on our thinking in the asynchronous setting. Chor and Dwork describe the nature of 
an asynchronous round as follows: "...there is something akin to a round even in a completely 
asynchronous system. Consider a set of n processors running a protocol tolerant to t faults, 
and let p be a correct processor in this set. If p broadcasts the message for its i th step in the 
protocol and receives step i messages from only n — t processors, then p cannot safely wait for 
additional step i messages because all t processors from which it has not heard may be faulty. 
In this case, p must proceed to step i + 1 in its protocol, and we say p has completed round 
i" [10]. This reasoning is convincing, but also a bit deceptive. For protocols like Ben-Or's 
[3] and Bracha's [6] which consult only the current round's messages, this reasoning is sound, 
but for protocols which may use more of the execution history, the situation is more subtle. 
For example, consider a processor that is in round 2 and has received round 2 messages from 
processors 1 through n - t. Suppose it previously received round 1 messages from processors 
t + 1 through n. Then there are a total of 2t processors that it has failed to hear from so far, 
so it may safely wait to receive t more messages for rounds 1 and 2 combined before moving to 
round 3. We have allowed for this sort of behavior in our validation primitive, but there could 
be other subtle violations of our restrictive notion of round behavior that could allow protocols 
to avoid our impossibility result. 
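
The counting in this example can be made concrete. The following is a small sketch under stated assumptions: every good processor eventually sends a message in every round and all such messages are eventually delivered, while each of the at most t faulty processors may withhold all of its messages. The function name `guaranteed_extra_messages` is hypothetical and is not one of the paper's primitives.

```python
from collections import Counter


def guaranteed_extra_messages(n, t, heard_by_round):
    """Number of still-missing (processor, round) messages that must
    eventually arrive, whichever t processors turn out to be faulty.

    heard_by_round maps a round number to the set of processor ids
    (1..n) whose message for that round has already been received.
    """
    missing_per_proc = Counter()
    for heard in heard_by_round.values():
        for p in range(1, n + 1):
            if p not in heard:
                missing_per_proc[p] += 1
    total_missing = sum(missing_per_proc.values())
    # Worst case: the t faulty processors are the ones with the most
    # missing messages, and none of those messages ever arrives.
    worst_case_withheld = sum(
        sorted(missing_per_proc.values(), reverse=True)[:t]
    )
    return total_missing - worst_case_withheld


if __name__ == "__main__":
    n, t = 10, 3
    # Round 1 heard from processors t+1..n, round 2 from 1..n-t.
    staggered = {1: set(range(t + 1, n + 1)), 2: set(range(1, n - t + 1))}
    print(guaranteed_extra_messages(n, t, staggered))  # 3, i.e. t

    # The same t processors are silent in both rounds: nothing is guaranteed.
    aligned = {1: set(range(1, n - t + 1)), 2: set(range(1, n - t + 1))}
    print(guaranteed_extra_messages(n, t, aligned))  # 0
```

The first case matches the example in the text, where 2t distinct processors are unheard and t further messages can safely be awaited. The second case is the situation in the Chor and Dwork quotation, where waiting is unsafe.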

This issue is related to our requirements for the Broadcast and Validate primitives. It is 
possible that one might leverage instantiations of these primitives with stronger properties or 
employ wholly new primitives to avoid our result without acquiring considerably more complexity 
in the high-level algorithm. 

A Proof of Lemma 2 

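Proof. We first consider $s \neq s^*$. In this case, $p_s \leq \tilde{p}_s$ and $\tilde{p}_s \geq \epsilon$. We define new independent 
random variables $\tilde{X}_1, \ldots, \tilde{X}_{(1-c)t}$ which are equal to 1 with probability $\tilde{p}_s$ and equal to 0 with 
probability $1 - \tilde{p}_s$. Since $p_s \leq \tilde{p}_s$, we have that: 

\[ \Pr\left[\sum_{i=1}^{(1-c)t} X_i > \tilde{p}_s t\right] \leq \Pr\left[\sum_{i=1}^{(1-c)t} \tilde{X}_i > \tilde{p}_s t\right]. \]

We note that 

\[ \tilde{p}_s t = \tilde{p}_s (1-c) t \left(\frac{1}{1-c}\right) = \mathbb{E}\left[\sum_{i=1}^{(1-c)t} \tilde{X}_i\right] \left(\frac{1}{1-c}\right). \]

Since $\frac{1}{1-c} = 1 + \frac{c}{1-c}$ and $\frac{c}{1-c} < 1$, the Chernoff bound yields: 

\[ \Pr\left[\sum_{i=1}^{(1-c)t} \tilde{X}_i > \tilde{p}_s t\right] \leq e^{-\tilde{p}_s c^2 t/(3(1-c))} \leq e^{-\epsilon c^2 t/(3(1-c))}. \]

We now consider $s^*$. Then $p_{s^*} \geq \frac{1}{R}$, and $\tilde{p}_{s^*} \geq p_{s^*} - R\left(\epsilon + \frac{1}{t}\right)$. We then have: 

\[ \frac{\tilde{p}_{s^*} t}{p_{s^*} (1-c) t} \geq \left(\frac{1}{1-c}\right) \left(1 - \frac{R\left(\epsilon + \frac{1}{t}\right)}{p_{s^*}}\right). \]

Since $p_{s^*} \geq \frac{1}{R}$, this quantity is: 

\[ \geq \left(\frac{1}{1-c}\right) \left(1 - R^2\left(\epsilon + \frac{1}{t}\right)\right). \]

Since $\epsilon$ was chosen so that $\epsilon + \frac{1}{t} \leq \frac{c}{2R^2}$, we have 

\[ \left(\frac{1}{1-c}\right) \left(1 - R^2\left(\epsilon + \frac{1}{t}\right)\right) \geq \left(\frac{1}{1-c}\right) \left(1 - \frac{c}{2}\right) = 1 + \frac{c}{2(1-c)}. \]

Hence, by the Chernoff bound applied with mean $p_{s^*}(1-c)t$ (since $\frac{c}{2(1-c)} \leq 1$), we have: 

\[ \Pr\left[\sum_{i=1}^{(1-c)t} X_i > \tilde{p}_{s^*} t\right] \leq e^{-p_{s^*} c^2 t/(12(1-c))} \leq e^{-c^2 t/(12R(1-c))}. \]

Since $\delta := \min\{\epsilon, \frac{1}{4R}\}$, we have shown that 

\[ \Pr\left[\sum_{i=1}^{(1-c)t} X_i > \tilde{p}_s t\right] \leq e^{-\delta c^2 t/(3(1-c))} \]

holds in all cases. 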